Recent progress in randomized motion planners has led to the development of a new class of sampling-based algorithms that provide asymptotic optimality guarantees, notably the RRT* and the PRM* algorithms. Careful analysis reveals that the so-called "rewiring" step in these algorithms can be interpreted as a local policy iteration (PI) step (i.e., a local policy evaluation step followed by a local policy improvement step), so that asymptotically, as the number of samples tends to infinity, both algorithms converge to the optimal path almost surely (i.e., with probability 1). Policy iteration, along with value iteration (VI), is a common method for solving dynamic programming (DP) problems. Based on this observation, the RRT$^{\#}$ algorithm has recently been proposed, which performs, during each iteration, Bellman updates (a.k.a. "backups") on those vertices of the graph that have the potential of being part of the optimal path (i.e., the "promising" vertices). The RRT$^{\#}$ algorithm thus utilizes dynamic programming ideas and implements them incrementally on randomly generated graphs to obtain high-quality solutions. In this work, based on this key insight, we explore a different class of dynamic programming algorithms for solving shortest-path problems on random graphs generated by iterative sampling methods. This class of algorithms utilizes policy iteration instead of value iteration, and is thus better suited for massive parallelization. Contrary to the RRT* algorithm, the policy improvement during the rewiring step is not performed only locally, but rather on the set of vertices classified as "promising" during the current iteration. This tends to speed up the whole process. The resulting algorithm, aptly named Policy Iteration-RRT$^{\#}$ (PI-RRT$^{\#}$), is the first of a new class of DP-inspired algorithms for randomized motion planning that utilize PI methods.
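To make the PI step referenced above concrete, the following is a minimal sketch of policy iteration for the shortest-path problem on a fixed weighted digraph. The function name, graph encoding, and fixed-point evaluation scheme are illustrative assumptions, not the paper's implementation; PI-RRT$^{\#}$ itself operates incrementally on a randomly growing graph and restricts the improvement step to the "promising" vertices.

```python
import math

def policy_iteration_shortest_path(vertices, edges, goal):
    """Policy iteration on a weighted digraph. `edges[u]` maps each
    successor v to the edge cost c(u, v). Returns (policy, cost_to_go)."""
    # Arbitrary initial policy: point each non-goal vertex at any successor.
    policy = {u: next(iter(edges[u])) for u in vertices if u != goal and edges[u]}

    while True:
        # Policy evaluation: solve J(u) = c(u, pi(u)) + J(pi(u)) under the
        # current policy pi, here by simple fixed-point sweeps.
        J = {u: (0.0 if u == goal else math.inf) for u in vertices}
        for _ in range(len(vertices)):
            for u, v in policy.items():
                J[u] = edges[u][v] + J[v]

        # Policy improvement: greedily "rewire" each vertex toward the
        # successor minimizing one-step cost plus cost-to-go.
        stable = True
        for u in policy:
            best = min(edges[u], key=lambda v: edges[u][v] + J[v])
            if edges[u][best] + J[best] < edges[u][policy[u]] + J[policy[u]] - 1e-12:
                policy[u] = best
                stable = False
        if stable:
            return policy, J

if __name__ == "__main__":
    V = {"a", "b", "c", "goal"}
    E = {"a": {"b": 1.0, "c": 4.0}, "b": {"c": 1.0, "goal": 5.0},
         "c": {"goal": 1.0}, "goal": {}}
    pi, J = policy_iteration_shortest_path(V, E, "goal")
    print(pi, J)  # pi traces a -> b -> c -> goal, with J["a"] == 3.0
```

Note that the improvement sweep treats each vertex independently given the evaluated cost-to-go, which is why PI-style updates lend themselves to massive parallelization, as claimed above.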